Bagging and Boosting for the Nearest Mean Classifier: Effects of Sample Size on Diversity and Accuracy
نویسندگان
چکیده
In combining classifiers, it is believed that diverse ensembles perform better than non-diverse ones. In order to test this hypothesis, we study the accuracy and diversity of ensembles obtained in bagging and boosting applied to the nearest mean classifier. In our simulation study we consider two diversity measures: the Q statistic and the disagreement measure. The experiments, carried out on four data sets have shown that both diversity and the accuracy of the ensembles depend on the training sample size. With exception of very small training sample sizes, both bagging and boosting are more useful when ensembles consist of diverse classifiers. However, in boosting the relationship between diversity and the efficiency of ensembles is much stronger than in bagging.
منابع مشابه
An experimental study on diversity for bagging and boosting with linear classifiers
In classifier combination, it is believed that diverse ensembles have a better potential for improvement on the accuracy than nondiverse ensembles. We put this hypothesis to a test for two methods for building the ensembles: Bagging and Boosting, with two linear classifier models: the nearest mean classifier and the pseudo-Fisher linear discriminant classifier. To estimate diversity, we apply n...
متن کاملImproving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran
An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...
متن کاملExamining the Relationship Between Majority Vote Accuracy and Diversity in Bagging and Boosting
Much current research is undertaken into combining classifiers to increase the classification accuracy. We show, by means of an enumerative example, how combining classifiers can lead to much greater or lesser accuracy than each individual classifier. Measures of diversity among the classifiers taken from the literature are shown to only exhibit a weak relationship with majority vote accuracy. ...
متن کاملExamining the Relationship Between Majority Vote Ac - curacy and Diversity in Bagging and
Much current research is undertaken into combining classifiers to increase the classification accuracy. We show, by means of an enumerative example, how combining classifiers can lead to much greater or lesser accuracy than each individual classifier. Measures of diversity among the classifiers taken from the literature are shown to only exhibit a weak relationship with majority vote accuracy. ...
متن کاملMalware Detection using Classification of Variable-Length Sequences
In this paper, a novel method based on the graph is proposed to classify the sequence of variable length as feature extraction. The proposed method overcomes the problems of the traditional graph with variable length of data, without fixing length of sequences, by determining the most frequent instructions and insertion the rest of instructions on the set of “other”, save speed and memory. Acco...
متن کامل